Power Law Correlations in DNA Sequences
Abstract
Introduction

A wide variety of natural phenomena are characterized by power-law behavior of their parameters. This type of behavior is also called scaling. The first observation of scaling probably goes back to Kepler, who empirically discovered that the squares of the planets' periods of revolution around the Sun scale as the cubes of their orbital radii. This empirical law allowed Newton to discover his famous inverse-square law of gravity. In the nineteenth century, it was realized that many physical phenomena, for example diffusion, can be described by partial differential equations. In turn, the solutions of these equations give rise to universal scaling laws. For example, the root mean square displacement of a diffusing particle scales as the square root of time.

In the twentieth century, power laws were found to describe various systems in the vicinity of critical points. These include not only systems of interacting particles, such as liquids and magnets, but also purely geometric systems, such as random networks. Scaling is also found to hold for polymeric systems, including both linear and branched polymers [4]. Since then, the list of systems characterized by power laws has grown rapidly and now includes models of rough surfaces, turbulence, and earthquakes. Empirical power laws are also found to characterize many physiological, ecological, and socio-economic systems. These facts give rise to the increasingly appreciated “fractal geometry of nature”.

A major puzzle concerning the genomes of eukaryotic organisms is that a large percentage of their DNA is not used to code for proteins or RNA. In the human genome, this “junk” DNA constitutes 97% of the total genome length, which is equal to 3 billion nucleotides, also called base pairs (bp). The role of non-coding DNA is poorly understood. It seems that it evolves by its own laws, not restricted by a specific biological function. These laws are based on the probabilities of various mutations and as such resemble the laws governing the other complex systems listed above.

In this chapter, I will review the degree to which power laws can characterize the fluctuating nucleotide content of DNA sequences; see also the critical review by W. Li [16]. The term “long range correlations” is often misunderstood as implying some mystical long-range interactions or information propagation in space. Therefore, I will start with a brief introduction to the theory of critical phenomena, in which this concept was developed. An impatient reader can jump directly to the section “Correlation Analysis of DNA Sequences”.
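As a small illustration of the diffusion scaling law mentioned in the introduction, the following sketch simulates unbiased one-dimensional random walks and estimates the exponent with which the root mean square displacement grows with the number of steps. It is not taken from the chapter: the function names `rms_displacement` and `fit_exponent`, the number of walkers, the number of steps, and the seed are all illustrative assumptions. For uncorrelated steps the fitted exponent should come out close to 1/2.

```python
import math
import random


def rms_displacement(n_walkers, n_steps, seed=0):
    """RMS displacement after each step, averaged over unbiased +/-1 walks.

    Hypothetical helper for illustration; parameters are assumptions.
    """
    rng = random.Random(seed)          # seeded so the run is reproducible
    sum_sq = [0.0] * n_steps           # running sum of x^2 at each time step
    for _ in range(n_walkers):
        x = 0
        for t in range(n_steps):
            x += 1 if rng.random() < 0.5 else -1
            sum_sq[t] += x * x
    return [math.sqrt(s / n_walkers) for s in sum_sq]


def fit_exponent(times, values):
    """Least-squares slope of log(values) against log(times)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


rms = rms_displacement(n_walkers=2000, n_steps=256)
times = [2 ** k for k in range(1, 9)]              # 2, 4, ..., 256 steps
alpha = fit_exponent(times, [rms[t - 1] for t in times])
print(f"estimated scaling exponent: {alpha:.2f}")
```

The exponent is read off as the slope on a log-log plot, which is the usual way a power law is identified in data. An exponent that differs clearly from 1/2 in such a fit would indicate that the steps are not independent.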
Similar resources
Mosaic organization of DNA nucleotides.
Long-range power-law correlations have been reported recently for DNA sequences containing noncoding regions. We address the question of whether such correlations may be a trivial consequence of the known mosaic structure ("patchiness") of DNA. We analyze two classes of controls consisting of patchy nucleotide sequences generated by different algorithms--one without and one with long-range po...
Fractal landscapes in biological systems: long-range correlations in DNA and interbeat heart intervals.
Here we discuss recent advances in applying ideas of fractals and disordered systems to two topics of biological interest, both topics having in common the appearance of scale-free phenomena, i.e., correlations that have no characteristic length scale, typically exhibited by physical systems near a critical point and dynamical systems far from equilibrium. (i) DNA nucleotide sequences have tradit...
Fractal landscape analysis of DNA walks.
By mapping nucleotide sequences onto a "DNA walk", we uncovered remarkably long-range power law correlations [Nature 356 (1992) 168] that imply a new scale invariant property of DNA. We found such long-range correlations in intron-containing genes and in non-transcribed regulatory DNA sequences, but not in cDNA sequences or intron-less genes. In this paper, we present more explicit evidences ...
Compositional segmentation and long-range fractal correlations in DNA sequences.
A segmentation algorithm based on the Jensen-Shannon entropic divergence is used to decompose long-range correlated DNA sequences into statistically significant, compositionally homogeneous patches. By adequately setting the significance level for segmenting the sequence, the underlying power-law distribution of patch lengths can be revealed. Some of the identified DNA domains were uncorrelated,...
Comment on "Linguistic features of noncoding DNA sequences"
In a recent letter [1], Mantegna et al. report that certain statistical signatures of natural language can be found in non-coding DNA sequences. The vast majority of DNA in higher organisms, including humans, consists of non-coding sequences whose function, if any, is unknown. Hence this new analysis is quite important. It suggests, as the authors concluded, "the possible existence of one (or...
The Lack of Long Range Correlations is a Necessary Condition for a Functional Biologically Active Protein
We study a random heteropolymer chain with a Gaussian distribution of kinds of monomers. Long-range correlations between kinds of monomers were introduced. The mean-field analysis of such a heteropolymer indicates the existence of an infinite energetic barrier between the heteropolymer random coil and frozen states. Thus, the frozen state is kinetically unavailable for the random heteropolymer with power...
Publication date: 2005